DETAILED ACTION 

1. This action is responsive to the amendment received on September 5, 2005. 
Claims 1, 3-23 are pending in the application. 



Claim Rejections - 35 U.S.C. § 103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

3. Claims 1, 3-8, 10 and 23 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Qian et al., US Patent 6721454, and further in view of Ratakonda, US 
Patent 5956026. 

As in Claims 1 and 10, Qian et al. teaches a method and computer storage 
medium with instructions for obtaining unstructured video frames ("A video sequence 2 
is input", Column 2, lines 64-65), generating segments from the shot boundaries based 
on the color dissimilarity between consecutive frames ("A color histogram technique 
may be used to detect the boundaries of the shots", Column 3, lines 42-43), extracting a 
set by processing pairs of segments ("the global motion of the video content is 
estimated 8 for each pair of frames in a shot", Column 3, lines 59-61) for their visual 
dissimilarity and temporal relationship, and merging the video segments by applying a 
probabilistic analysis to the extracted set to represent the video structure ("each shot is 
summarized 16 ... events 22 are inferred from the shot summaries by a domain specific 
event inference model", Column 3, lines 6-8). While Qian et al. teaches extracting 
semantic events from unstructured video frames, they fail to show the generation of 
inter-segment color dissimilarity feature and inter-segment temporal relationship feature 
of each pair of segments as recited in the claims. In the same field of the invention, 
Ratakonda teaches a video event detection and segmentation merging method similar 
to that of Qian et al. In addition, Ratakonda further teaches the generation of inter- 
segment color dissimilarity feature and inter-segment temporal relationship feature of 
each pair of segments (Figures 1, 5 and corresponding text). It would have been 
obvious to one of ordinary skill in the art, having the teachings of Qian et al. and 
Ratakonda before him at the time the invention was made, to modify the segment 
generation and merging techniques taught by Qian et al. to include the processing of 
each pair of segments of Ratakonda, in order to obtain not only frames, but also inter- 
segment similarity processing. One would have been motivated to make such a 
combination because layered hierarchical structure would have been obtained, as 
taught by Ratakonda. 

As in Claim 23, Qian et al. teaches generating color histograms from the 
consecutive frames and from the histograms, generating a difference signal, 
thresholding of this signal based on a mean dissimilarity over several frames to produce 
a signal representative of the existence of a shot boundary (Column 3, lines 42-50 and 
Figure 5). 
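
By way of illustration only, the histogram-based boundary detection described for Claim 23 can be sketched in Python as follows. The sketch is not drawn from Qian et al.; the function names, the four-bin grayscale histogram, the five-sample window and the factor of 3.0 are all assumptions chosen for the example.

```python
def color_histogram(frame, bins=4):
    """Normalized histogram of a frame given as a flat list of 0-255 values."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def difference_signal(frames, bins=4):
    """L1 distance between the histograms of each pair of consecutive frames."""
    hists = [color_histogram(f, bins) for f in frames]
    return [sum(abs(a - b) for a, b in zip(h0, h1))
            for h0, h1 in zip(hists, hists[1:])]


def shot_boundaries(diff, window=5, factor=3.0):
    """Indices where the difference exceeds factor times the local mean.

    The local mean is taken over a window centered on each sample; the
    window is simply truncated near the ends of the signal (an assumption).
    """
    half = window // 2
    boundaries = []
    for i, d in enumerate(diff):
        local = diff[max(0, i - half): i + half + 1]
        if d > factor * (sum(local) / len(local)):
            boundaries.append(i)
    return boundaries


# Three dark frames followed by three bright frames (hypothetical data).
frames = [[10] * 16] * 3 + [[200] * 16] * 3
print(shot_boundaries(difference_signal(frames)))
```

An index i in the result marks a candidate boundary between frame i and frame i + 1.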

As in Claim 3, Qian et al. teaches obtaining unstructured video frames, 
generating segments from the shot boundaries based on the color dissimilarity between 
consecutive frames, extracting a set by processing pairs of segments for their visual 
dissimilarity and temporal relationship by generating color histograms from the 
consecutive frames and from the histograms, generating a difference signal, 
thresholding of this signal based on a mean dissimilarity over several frames to produce 
a signal representative of the existence of a shot boundary (See Claim 23 rejection 
supra) and merging the video segments by applying a probabilistic analysis to the 
extracted set to represent the video structure (See Claim 1 rejection supra) and the 
difference signal to be based on a mean dissimilarity over several frames centered on 
one frame. Qian et al. fails to teach basing the number of frames used to calculate the 
difference signal on a fraction of the frame rate of video capture as recited in the claims. 
Within the field of the invention, it would be obvious to one of ordinary skill in the art to 
base the number of frames on a fraction of the frame rate (See also Image Analysis and 
Mathematical Morphology, Vol. 1, Jean Serra). One would have been motivated to make 
such a combination because a shortened time frame for calculating the difference signal 
would have been obtained. 
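
As a purely illustrative sketch, deriving the number of frames from a fraction of the capture frame rate might look like the following. The function name and the choice to force an odd length (so that the window can be centered on one frame) are assumptions of the example and are not taken from the cited references.

```python
def window_length(frame_rate, fraction):
    """Number of frames in the averaging window, as a fraction of the frame rate."""
    frames = max(1, int(frame_rate * fraction))
    return frames | 1  # round an even count up to the next odd number


print(window_length(30, 0.5))   # half a second of video at 30 frames per second
print(window_length(24, 0.25))  # a quarter of a second at 24 frames per second
```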

As in Claim 4, Qian et al. teaches morphologically transforming the thresholded 
difference signal with a pair of structuring elements to eliminate the presence of multiple 
adjacent shot boundaries ("When the difference between the histograms of two frames 
exceeds a predefined threshold, the content of the two frames is assumed to be 
sufficiently different", Column 3, lines 45-48). 
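
For illustration, one way adjacent boundary detections could be suppressed morphologically is a one-dimensional closing (a dilation followed by an erosion) applied to the thresholded signal, after which one detection is kept per connected run. This is an assumed approach for the example only; the structuring element size, the border handling and the function names are hypothetical and are not taken from Qian et al. or from the claims.

```python
def dilate(signal, size):
    """Binary dilation with a flat structuring element of the given length."""
    half, n = size // 2, len(signal)
    return [int(any(signal[max(0, i - half): min(n, i + half + 1)]))
            for i in range(n)]


def erode(signal, size):
    """Binary erosion; samples outside the signal are ignored (an assumption)."""
    half, n = size // 2, len(signal)
    return [int(all(signal[max(0, i - half): min(n, i + half + 1)]))
            for i in range(n)]


def suppress_adjacent(signal, size=3):
    """Keep the first original detection in each run of the closed signal."""
    closed = erode(dilate(signal, size), size)
    kept, i, n = [], 0, len(closed)
    while i < n:
        if closed[i]:
            j = i
            while j < n and closed[j]:
                j += 1
            hits = [k for k in range(i, j) if signal[k]]
            if hits:
                kept.append(hits[0])
            i = j
        else:
            i += 1
    return kept


# Detections at 1 and 3 are close together; the one at 7 is isolated.
print(suppress_adjacent([0, 1, 0, 1, 0, 0, 0, 1, 0]))
```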

As in Claim 5, Qian et al. teaches computing a mean color histogram for each 
segment and a visual dissimilarity feature metric from the difference between mean 
color histograms for pairs of segments (Column 3, lines 42-50 and Figure 5). 
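
The feature recited in Claim 5 can be illustrated with a short sketch. It assumes that each segment is already available as a list of per-frame normalized histograms and uses an L1 distance; both choices, and the function names, are assumptions of the example rather than details of the cited art.

```python
def mean_histogram(frame_histograms):
    """Bin-wise average of the histograms of all frames in one segment."""
    n = len(frame_histograms)
    return [sum(bin_values) / n for bin_values in zip(*frame_histograms)]


def visual_dissimilarity(segment_a, segment_b):
    """L1 distance between the mean histograms of two segments."""
    mean_a = mean_histogram(segment_a)
    mean_b = mean_histogram(segment_b)
    return sum(abs(a - b) for a, b in zip(mean_a, mean_b))


# Two hypothetical segments, each with two frames and two histogram bins.
segment_a = [[1.0, 0.0], [0.5, 0.5]]
segment_b = [[0.0, 1.0], [0.5, 0.5]]
print(visual_dissimilarity(segment_a, segment_b))
```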

As in Claim 6, Qian et al. teaches processing pairs of segments for a temporal 
separation between pairs of segments and for an accumulated temporal duration 
between pairs of segments ("each shot is summarized 16 ... events 22 are inferred from 
the shot summaries by a domain specific event inference model", Column 3, lines 6-8). 

As in Claim 7, Qian et al. teaches generating parametric mixture models 
(summaries created by shot summarization 16, Figure 1) to represent class-conditional 
densities of inter-segment features (based on temporal information and color analysis, 
See Claim 1 rejection supra) of the feature set and applying the merging criterion to the 
parametric mixture models (event inference 20/detected events 22, Figure 1). 

As in Claim 8, it is notoriously well known that queues are used to implement 
hierarchical displays. The examiner takes official notice of this teaching. It would be 
obvious to one of ordinary skill in the art to combine the organizing of video 
segments into hierarchies with a queue implementation. 

4. Claims 9 and 11-22 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Qian et al., US Patent 6721454 and Ratakonda, US Patent 5956026 and further in 
view of Qian et al., US Patent 6616529. 

As in Claims 9, 11, 17-18 and 20, US Patent 6721454 and Ratakonda teach a 
method and computer storage medium with instructions for obtaining unstructured video 
frames, generating segments from the shot boundaries based on the color dissimilarity 
between consecutive frames, extracting a set by processing pairs of segments for their 
color dissimilarity and temporal relationship of each pair of segments, merging adjacent 
video segments by applying a probabilistic analysis to the extracted set to represent the 
video structure, and generating a parametric mixture model of the inter-segment 
features (See Claim 1 rejection supra). While US Patent 6721454 and Ratakonda teach 
the segmentation due to color dissimilarity, extraction due to visual dissimilarity and 
temporal relationships, merging with probabilistic analysis and generation of a 
parametric mixture model, they fail to show the probabilistic analysis to be a Bayesian 
analysis applied to the parametric mixture model, and representing the merging 
sequence in a hierarchical tree structure as recited in the claims. US Patent 6616529 
teaches a video segmentation method similar to that of US Patent 6721454 and 
Ratakonda. In addition, US Patent 6616529 further teaches the probabilistic analysis to 
be a Bayesian analysis applied to the parametric mixture model (Figure 3 and 
corresponding text in Columns 4-5), and representing the merging sequence in a 
hierarchical tree structure (Figures 2a-2g and corresponding text). It would have been 
obvious to one of ordinary skill in the art, having the teachings of US Patent 6721454 
and Ratakonda and US Patent 6616529 before him at the time the invention was made, 
to modify the segmentation with color dissimilarity and temporal relationships with a 
parametric mixture model taught by US Patent 6721454 and Ratakonda to include the 
construction of hierarchy according to probabilistic merging with Bayesian analysis of 
US Patent 6616529, in order to obtain a hierarchical representation of the frames 
grouped by color dissimilarity and temporal relationships according to Bayesian 
probability methods of analysis. One would have been motivated to make such a 
combination because a visual representation of the segmented video would have been 
obtained, as taught by US Patent 6616529 (Column 2, lines 24-55). 

As in Claim 12, US Patent 6721454 and Ratakonda teach computing a mean 
color histogram for each segment and a visual dissimilarity feature metric from the 
difference between mean color histograms for pairs of segments (See Claim 5 rejection 
supra). 

As in Claim 13, US Patent 6721454 and Ratakonda teach processing pairs of 
segments for a temporal separation between pairs of segments and for an accumulated 
temporal duration between pairs of segments (See Claim 6 rejection supra). 

As in Claim 14, US Patent 6721454 and Ratakonda teach generating parametric 
mixture models to represent class-conditional densities of the inter-segment features 
that comprise the feature set (See Claim 7 rejection supra). 

As in Claim 15, US Patent 6721454 and Ratakonda teach performing the 
merging in a hierarchical queue by initializing the queue by introducing each feature in 
the queue with a priority of the probability of merging each corresponding pair of 
segments, depleting the queue by merging the segments if the criterion is met, and 
updating the queue based on the updated model (See Claim 8 rejection supra). 
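
The queue-based merging recited in Claim 15 can be sketched, for illustration only, with a priority queue. The sketch assumes a scalar feature per segment and a hypothetical `merge_probability` function standing in for a model-based probability; the threshold, the node layout and all names are assumptions and are not taken from the cited references.

```python
import heapq
import math


def merge_probability(a, b):
    # Hypothetical stand-in for a model-based merge probability:
    # close to 1 for similar features, close to 0 for dissimilar ones.
    return math.exp(-abs(a["feature"] - b["feature"]))


def merge_segments(features, threshold=0.5):
    """Greedily merge adjacent segments; return (merge sequence, root ids)."""
    nodes = {i: {"feature": f, "size": 1, "children": None}
             for i, f in enumerate(features)}
    order = list(nodes)   # live node ids in temporal order
    heap = []             # max-priority queue via negated probabilities

    def push(a, b):
        heapq.heappush(heap, (-merge_probability(nodes[a], nodes[b]), a, b))

    # Initialize the queue with every pair of adjacent segments.
    for a, b in zip(order, order[1:]):
        push(a, b)

    merges = []
    next_id = len(nodes)
    while heap:
        neg_p, a, b = heapq.heappop(heap)
        if -neg_p <= threshold:
            break         # best remaining candidate fails the criterion
        if a not in order or b not in order:
            continue      # stale entry: one side was already merged
        na, nb = nodes[a], nodes[b]
        size = na["size"] + nb["size"]
        nodes[next_id] = {
            "feature": (na["feature"] * na["size"]
                        + nb["feature"] * nb["size"]) / size,
            "size": size,
            "children": (a, b),
        }
        i = order.index(a)
        order[i:i + 2] = [next_id]
        merges.append((a, b))
        # Update the queue with the new node's neighbors.
        if i > 0:
            push(order[i - 1], next_id)
        if i + 1 < len(order):
            push(next_id, order[i + 1])
        next_id += 1
    return merges, order


merges, roots = merge_segments([0.0, 0.1, 5.0, 5.1])
print(sorted(merges), len(roots))
```

Because every merged node records exactly two children, the merge sequence in this sketch forms a binary tree over the original segments.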

As in Claim 16, US Patent 6721454 and Ratakonda teach representing the 
merging sequence as a hierarchical tree structure (See Claim 9 rejection supra) 
including a frame extracted from each segment and displayed at each node of the tree 
(Column 10, line 61 - Column 11, line 6). 

As in Claim 19, US Patent 6721454 and Ratakonda teach representing the 
merging sequence as a hierarchical tree structure including a frame extracted from each 
segment and displayed at each node of the tree (See Claim 16 rejection supra). 

Application/Control Number: 09/927,041 
Art Unit: 2179 

As in Claim 21, US Patent 6721454 and Ratakonda teach a method for 
generating video segments from the unstructured video frames ("A video sequence 2 is 
input", Column 2, lines 64-65), by detecting shot boundaries based on the color 
dissimilarity between consecutive frames ("A color histogram technique may be used to 
detect the boundaries of the shots", Column 3, lines 42-43), extracting a feature set by 
processing pairs of segments ("the global motion of the video content is estimated 8 for 
each pair of frames in a shot", Column 3, lines 59-61) for their visual dissimilarity and 
temporal relationship, merging adjacent video segments by applying a probabilistic 
analysis to the feature set to represent the video structure independent of any empirical 
parameter determination ("each shot is summarized 16 ... events 22 are inferred from 
the shot summaries by a domain specific event inference model", Column 3, lines 6-8). 
While US Patent 6721454 teaches the segmentation due to color dissimilarity, 
extraction due to visual dissimilarity and temporal relationships, merging with 
probabilistic analysis and generation of a parametric mixture model, they fail to show 
generating a hierarchy having a merging sequence represented by a binary partition 
tree as recited in the claims. US Patent 6616529 teaches a video segmentation method 
similar to that of US Patent 6721454. In addition, US Patent 6616529 further teaches 
generating a hierarchy having a merging sequence represented by a binary partition 
tree (Figures 2a-2g and corresponding text). It would have been obvious to one of 
ordinary skill in the art, having the teachings of US Patent 6721454 and US Patent 
6616529 before him at the time the invention was made, to modify the segmentation 
with color dissimilarity and temporal relationships with a parametric mixture model 
taught by US Patent 6721454 to include the construction of hierarchy having a merging 
sequence represented by a binary partition tree of US Patent 6616529, in order to 
obtain a hierarchical representation of the frames grouped by color dissimilarity and 
temporal relationships. One would have been motivated to make such a combination 
because an organized visual representation of the segmented video would have been 
obtained, as taught by US Patent 6616529 (Column 2, lines 24-55). 

As in Claim 22, US Patent 6616529 teaches maximizing the a posteriori 
probability mass function of a binary random variable that represents inter-segment 
features of the video segments (Figures 2a-2g and Column 2, lines 45, et seq.). 
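
To illustrate what maximizing the a posteriori probability of a binary random variable involves, the following sketch compares the two unnormalized posteriors for a single scalar inter-segment feature. It assumes Gaussian mixture class-conditional densities with made-up parameters; the names, the parameters and the equal prior are hypothetical and are not taken from US Patent 6616529.

```python
import math


def gaussian(x, mean, std):
    """Density of a one-dimensional normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_density(x, components):
    """Density of a mixture given as (weight, mean, std) triples."""
    return sum(w * gaussian(x, m, s) for w, m, s in components)


def map_merge_decision(x, merge_model, split_model, prior_merge=0.5):
    """True when 'merge' has the larger posterior for feature value x."""
    posterior_merge = prior_merge * mixture_density(x, merge_model)
    posterior_split = (1.0 - prior_merge) * mixture_density(x, split_model)
    return posterior_merge > posterior_split


# Hypothetical class-conditional models: small feature values favor merging.
merge_model = [(1.0, 0.0, 1.0)]
split_model = [(1.0, 4.0, 1.0)]
print(map_merge_decision(0.5, merge_model, split_model))
print(map_merge_decision(3.5, merge_model, split_model))
```

The shared normalizing term in Bayes' rule is omitted because it does not affect which of the two posteriors is larger.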

Response to Arguments 

Applicant's arguments filed 9/5/05 have been fully considered but they are not 
persuasive. 

Applicant's arguments with respect to claims 1 and 10, have been considered but 
are moot in view of the new ground(s) of rejection. 

Applicant has said that Claim 3 is allowable as depending from Claim 1, but has 
not addressed the obviousness rejection of Claim 3; therefore, the examiner asserts that it is 
an admission of prior art that within the field of the invention, it would be obvious to one 
of ordinary skill in the art to base the number of frames on a fraction of the frame rate 
(See above). The examiner assumes that the applicant acknowledges this rejection of 
obviousness. One would have been motivated to make such a combination because a 
shortened or lengthened (dependent upon the value of the fraction) time frame for 
calculating the difference signal would have been obtained. 

In response to the arguments regarding claim 4, Qian does teach eliminating the 
presence of multiple adjacent shot boundaries. Even if Qian did not, this is also 
shown by Ratakonda in the higher levels of the hierarchy. 

In response to the arguments regarding claims 5 and 6, Ratakonda teaches 
processing each pair of segments for dissimilarity in the same way Qian does for frames 
as seen supra. 

In response to the arguments regarding claim 7, that the references fail to show 
certain features of applicant's invention, it is noted that the features upon which 
applicant relies (i.e., "statistical models") are not recited in the rejected claim(s). 
Although the claims are interpreted in light of the specification, limitations from the 
specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 
USPQ2d 1057 (Fed. Cir. 1993). 

In response to the arguments regarding claim 8, Qian teaches that the process of 
"inserting" merges frames together, constituting a pair of segments that define the event, 
and updates the model of the merged segment. Ratakonda further illustrates step d as 
seen supra. 

Conclusion 



The prior art made of record on form PTO-892 and not relied upon is considered 
pertinent to applicant's disclosure. Applicant is required under 37 C.F.R. § 1.111(c) to 
consider these references fully when responding to this action. The documents cited 
therein teach similar video segment merging techniques. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Sara M. Hanne whose telephone number is (571) 272-4135. 
The examiner can normally be reached on M-F 7:30am-4:00pm, off on 
alternating Fridays. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, WEILUN LO can be reached on (571) 272-4847. The fax phone number for 
the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 